Surprise-Based Intrinsic Motivation for Deep Reinforcement Learning
Authors
Abstract
Exploration in complex domains is a key challenge in reinforcement learning, especially for tasks with very sparse rewards. Recent successes in deep reinforcement learning have been achieved mostly using simple heuristic exploration strategies such as ε-greedy action selection or Gaussian control noise, but there are many tasks where these methods are insufficient to make any learning progress. Here, we consider more complex heuristics: efficient and scalable exploration strategies that maximize a notion of an agent’s surprise about its experiences via intrinsic motivation. We propose to learn a model of the MDP transition probabilities concurrently with the policy, and to form intrinsic rewards that approximate the KL-divergence of the true transition probabilities from the learned model. One of our approximations results in using surprisal as intrinsic motivation, while the other gives the k-step learning progress. We show that our incentives enable agents to succeed in a wide range of environments with high-dimensional state spaces and very sparse rewards, including continuous control tasks and games in the Atari RAM domain, outperforming several other heuristic exploration techniques.
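The two incentives named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a discrete state space and replaces the learned model with a hypothetical count-based table (`TabularTransitionModel`, with add-one smoothing), and the weighting coefficient `eta` and the helper names are illustrative assumptions. Surprisal is taken as the negative log-probability the model assigns to the observed transition; learning progress is taken as the gain in log-probability between an older and a newer version of the model.

```python
import math
from collections import defaultdict


class TabularTransitionModel:
    """Hypothetical count-based estimate of P(s' | s, a).

    Assumption: discrete states and add-one (Laplace) smoothing. The
    abstract only says a transition model is learned alongside the policy.
    """

    def __init__(self, n_states):
        self.n_states = n_states
        # Maps (state, action) to a list of next-state visit counts.
        self.counts = defaultdict(lambda: [0] * n_states)

    def prob(self, s, a, s_next):
        row = self.counts[(s, a)]
        return (row[s_next] + 1) / (sum(row) + self.n_states)

    def update(self, s, a, s_next):
        self.counts[(s, a)][s_next] += 1


def surprisal_bonus(model, s, a, s_next):
    """Surprisal: -log of the model's probability for the observed transition."""
    return -math.log(model.prob(s, a, s_next))


def learning_progress_bonus(log_p_new, log_p_old):
    """Gain in log-probability between a newer and an older model."""
    return log_p_new - log_p_old


def shaped_reward(r_ext, bonus, eta=0.1):
    """Extrinsic reward plus a weighted intrinsic bonus (eta is an assumption)."""
    return r_ext + eta * bonus


if __name__ == "__main__":
    model = TabularTransitionModel(n_states=3)
    before = surprisal_bonus(model, s=0, a=0, s_next=1)
    for _ in range(5):
        model.update(0, 0, 1)
    after = surprisal_bonus(model, s=0, a=0, s_next=1)
    # A transition seen repeatedly becomes less surprising.
    print(before > after)
    print(shaped_reward(0.0, learning_progress_bonus(-after, -before)))
```

In this sketch a transition that the model has seen often earns a small surprisal bonus, so the agent is nudged toward transitions the model still predicts poorly. The learning-progress variant instead rewards transitions on which the model has recently improved.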
Similar resources
How Students’ Views on Educational Factors Influence Their Achievement Motivation and Learning Approaches? Comparison of Perspectives
This comparative study was conducted to explore achievement motivation and learning approaches of agricultural students and to examine students’ views on educational factors influencing their achievement motivation and learning approaches. The statistical population of this study comprised agricultural students of Tehran University (Tehran, Iran) and Ghent University (Belgium). A sample of 89 a...
Effect of Type of Feedback on Intrinsic Motivation and Learning of Volleyball Jump Serve in Students with Different Levels of Neuroticism
Background. Several researchers have studied the effects of type of feedback on learning motor skills, but there are few studies on the interaction between personality traits and the type of feedback. Objectives. This study aimed at investigating the effect of type of feedback on intrinsic motivation and learning volleyball jump serve in students with neuroticism. Methods. A total of 59 femal...
Operation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
In this paper, a Deep Reinforcement Learning (DRL) based approach is proposed for the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs). Due to the dynamic characteristics of the problem, it is first formulated as a Markov Decision Process (MDP). Next, the Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
The Mediating Role of Intrinsic Motivation in the Relationship Between Basic Psychological Needs and Life Satisfaction
The aim of the present study was to investigate the mediating role of intrinsic motivation in the relationship between students’ basic psychological needs and life satisfaction. The study used a causal model and was based on self-determination theory. To this end, 292 graduate students of Tehran University (145 boys, 147 girls) completed an Overall Life Satisfaction (O...
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. The primary difficulty arises due to insufficient exploration, resulting in an agent being unable to learn robust value functions. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve problems. Such intrinsic behaviors...
Journal title:
- CoRR
Volume: abs/1703.01732
Issue:
Pages: -
Publication date: 2017